Efficient Implementation of the Improved Quasi-Minimal Residual Method on Massively Distributed Memory Computers
Authors
Abstract
For the solution of linear systems of equations with unsymmetric coefficient matrices, we have proposed an improved version of the quasi-minimal residual (IQMR) method that uses the Lanczos process as a major component and combines elements of numerical stability and parallel algorithm design. For the Lanczos process, stability is obtained by a coupled two-term procedure that generates Lanczos vectors scaled to unit length. The algorithm is derived such that all inner products and matrix-vector multiplications of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with computation time. Therefore, the cost of global communication on parallel distributed memory computers can be significantly reduced. In this paper, we describe an efficient implementation of this method which is particularly well suited to problems with irregular sparsity patterns. The corresponding communication cost is independent of the sparsity pattern, and several performance improvement techniques are used, such as overlapping computation with communication and balancing the computational load. The performance is demonstrated by numerical experiments carried out on the massively parallel distributed memory computer Parsytec GC/PowerPlus.
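The value of independent inner products can be illustrated with a small sketch. It is not the IQMR algorithm and not the authors' implementation: it only simulates, with plain Python lists, how inner products that do not depend on each other could share a single global reduction instead of requiring one reduction each. Combining them into one reduction is an assumption made here for illustration, and all names (`split`, `all_reduce_sum`, `batched_inner_products`, `separate_inner_products`) are hypothetical.

```python
# Illustrative simulation only: "processes" are just list blocks, and
# all_reduce_sum stands in for a real global reduction on a parallel machine.

def split(vec, nprocs):
    """Partition a vector into nprocs contiguous blocks (trailing blocks may be short or empty)."""
    size = -(-len(vec) // nprocs)  # ceiling division
    return [vec[i * size:(i + 1) * size] for i in range(nprocs)]

def all_reduce_sum(contributions):
    """Stand-in for ONE global reduction: element-wise sum over all processes."""
    return [sum(vals) for vals in zip(*contributions)]

def batched_inner_products(pairs, nprocs):
    """Compute several independent inner products with a single reduction.

    pairs: list of (u, v) vector pairs whose inner products are independent.
    Returns (results, number_of_global_reductions).
    """
    blocks = [(split(u, nprocs), split(v, nprocs)) for u, v in pairs]
    contributions = []
    for p in range(nprocs):
        # Each process forms its local partial sum for every pair.
        contributions.append(
            [sum(a * b for a, b in zip(ub[p], vb[p])) for ub, vb in blocks]
        )
    return all_reduce_sum(contributions), 1

def separate_inner_products(pairs, nprocs):
    """Same results, but with one global reduction per inner product."""
    results, reductions = [], 0
    for pair in pairs:
        value, n = batched_inner_products([pair], nprocs)
        results.append(value[0])
        reductions += n
    return results, reductions

u = [1, 2, 3, 4, 5]
v = [5, 4, 3, 2, 1]
w = [1, 1, 1, 1, 1]
pairs = [(u, v), (u, w), (v, w)]
batched = batched_inner_products(pairs, nprocs=3)
separate = separate_inner_products(pairs, nprocs=3)
```

Both variants return the same inner products; the batched one counts one global reduction and the separate one counts three. On a real machine, a single reduction like this could additionally be started in non-blocking form and overlapped with local computation, which is the effect the abstract describes.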
Similar resources
The Improved Quasi-Minimal Residual
For the solution of linear systems of equations with unsymmetric coefficient matrices, we propose an improved version of the quasi-minimal residual (IQMR) method that uses the Lanczos process as a major component and combines elements of numerical stability and parallel algorithm design. For the Lanczos process, stability is obtained by a coupled two-term procedure that generates Lanczos vectors normali...
Parallel IQMR Method for Unsymmetric Large and Sparse Linear Systems in Computational Fluid Dynamics
We mainly examine the application of the improved version of the quasi-minimal residual (IQMR) method [20], [21] to the solution of linear systems of equations with unsymmetric coefficient matrices arising from the discretization of fluid dynamic problems on massively parallel distributed memory computers. We deal with implicit finite difference schemes for solving the Euler equations. These sc...
Efficient Implementation of the Improved Unsymmetric Lanczos Process on Massively Distributed Memory Computers
For the eigenvalues of a large and sparse unsymmetric coefficient matrix, we have proposed an improved version of the unsymmetric Lanczos process combining elements of numerical stability and parallel algorithm design. The algorithm is derived such that all inner products and matrix-vector multiplications of a single iteration step are independent and communication time required for inner produc...
Performance Evaluation of the Improved Quasi-Minimal Residual Method
For the solution of linear systems of equations with unsymmetric coefficient matrices, we have proposed an improved version of the quasi-minimal residual (IQMR) method [18] that uses the Lanczos process as a major component and combines elements of numerical stability and parallel algorithm design. For the Lanczos process, stability is obtained by a coupled two-term procedure that generates Lanczos vecto...
Parallel Execution Time Analysis for Least Squares Problems on Distributed Memory Architectures
In this paper we study the parallelization of PCGLS, a basic iterative method whose main idea is to organize the computation of the preconditioned conjugate gradient method applied to the normal equations. Two important questions are discussed: what is the best possible data distribution, and which communication network topology is most suitable for solving least squares problems on massively paralle...